Uczelnia: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie
Wydział: Wydział Geodezji Górniczej i Inżynierii Środowiska
Kierunek: Geoinformacja
Rok: drugi
Semestr: czwarty
Przedmiot: Przetwarzanie Danych Środowiskowych
Prowadzacy: dr inż. Mateusz Rzeszutek
Autorzy:
Tymoteusz Maj, 401370, tymoteuszmaj@student.agh.edu.pl
Hubert Dębowski, 407955, hdebowski@student.agh.edu.pl
Celem projektu jest zapoznanie się z analizą szeregów czasowych, wybranie nalepszego modelu i prognozy w języku R. Dane będą pochodzić z baz GIOS i ISD NOAA.
Pozyskaliśmy dane z bazy ISD NOAA oraz GIOŚ.
kat_dost <- "/Users/hdebowski/Downloads/R/Dane"
#getMeta(site = "", lat = 51.403611, lon = 21.156667,end.year = "current", plot = T, returnMap = T )
# dane <- importNOAA(code = "124880-99999", year = 2000:2019)
#mapa <- getMeta(site= "KOZIENICE", end.year="current", returnMap=TRUE)
# kozienice <- dane
# kozienice$date <- kozienice$date+3600
# kozienice <-kozienice[c(3,7:14)]
# meta <- gios_metadane(type = "stacje",
# download = F, #T za pierwsztm razem
# path = kat_dost,
# mode = "wb")
#gios_vis(data = meta %>% filter(is.na(data.zamkniecia))) # tylko aktywne stacje
# stanowiska <- gios_metadane(type = "stanowiska",
# download = F,
# path = kat_dost,
# mode = "wb")
#
# statystyki_podstawowe <- gios_metadane(type = "statystyki",
# download = F,
# path = kat_dost,
# mode = "wb")
# pliki_all <- map2(.x = as.list(zrodlo[,1]),
# .y = as.list(zrodlo[,2]),
# .f = gios_download,
# path = kat_dost,
# mode = "wb")
# wek <- 2000:2020 %>%
# as.character() %>%
# paste0(kat_dost, "/", .)
#
# pliki_all <- map(.x = wek,
# .f = dir)
# names(pliki_all) <- paste0("R",zrodlo[,2])
# names(pliki_all)
# kody <- c("MzRadTochter")
# inp_pm10_1 <- map(.x = pliki_all,
# .f = ~ .[str_detect(., pattern = "PM10_1g")]) %>% flatten_chr()
#
# PM10 <- map_df(.x = inp_pm10_1,
# .f = gios_read,
# czas_mu = "1g",
# path = kat_dost)
#
# PM10 <- gios_kody(data = PM10, meta = meta)
#
# unique(PM10$kod) %>% .[str_detect(., "MzRad")]
# Radom <- PM10 %>% filter(kod == "MzRadTochter")
# summary(Radom)
#laczenie
# Radom <- left_join(Radom, kozienice, by = "date")
# summary(Radom)
#postanowiliśmy "okroić" naszą zmienną wybierając tylko te zmienne, które realnie przydadzą nam się w naszej analizie, uznaliśy że przyśpieszy to przebieg naszej analizy oraz ułatwi nam pracę.
# Radom2 <- Radom %>%
# dplyr::select(date, obs, ws, air_temp, dew_point, atmos_pres, RH, visibility)
# summary(Radom2)
load("/Users/hdebowski/czwartek_dane.RData")Sprawdzamy kompletności danych. Po wybraniu kolumn które będą nam przydatne za pomocą funkcji vis_miss wizualizujemy braki danych. Brakujące dane są na poziomie 21.5%.
vis_miss(Radom2, warn_large_data = F)Uzupełniamy braki danych korzystając z algorytmu locf.
dane_gotowe <- na.seadec(
Radom2,
algorithm = "locf",
find_frequency = TRUE,
maxgap = Inf,
)
vis_miss(dane_gotowe, warn_large_data = F)Uśredniamy dane do postaci średniomiesięcznej oraz wybieramy interesujące nas lata do analizy - 2004-2019.
#usrednianie danych
Radom_m <- dane_gotowe %>% timeAverage(avg.time = "month")
#Radom_m2 <- Radom2 %>% timeAverage(avg.time = "month")
#Radom_m2<- Radom_m2 %>% selectByDate(year= 2004:2019)
Radom_m <- Radom_m %>% selectByDate(year= 2004:2019)
#Radom_d <- Radom2 %>% timeAverage(avg.time = "day")
#Radom_w <- Radom2 %>% timeAverage(avg.time = "week")
#plot(Radom_m)
#summary(Radom_m)Po wykonaniu uśredniania zgodnie z powyższymi parametrami nadpiszmy nasze zmienne korzystając z biblioteki tsibble.
#Radom_d <- Radom_d %>%
# mutate(date = ymd(date)) %>%
# as_tsibble(index = date)
Radom_m <- Radom_m %>%
mutate(date = yearmonth(date)) %>%
as_tsibble(index = date)Dla poniższego wykresu dostrzegalna jest sezonowość (występowanie wzrostów i spadków następuje w sposób cykliczny)
Radom_m %>% autoplot() + ggtitle("Stężenie PM10 w latach 2004-2019")+
ylab("Stężenie PM10")+
xlab("Rok")Na poniższym wykresie zauważalny jest spadek stężenia PM10 w okresach zimowych oraz spadki w sezonie letnim. Wyróżniającymi się latami są 2006 oraz … (kolor oliwkowy)(najwyższe wartości w porównaniu z pozostałymi latami)
gg_season(Radom_m,obs, labels = "both")+ ylab("PM10") +
xlab("miesiąc")+
ggtitle("Wykres sezonowy: Stezenie sredniomiesieczne PM10 w latach 2004-2019") Wykres wskazuje że największe stężenie PM10 występuje w lutym, delikatnie ustępuje mu styczeń jednakże jego średnia jest mocno zawyżona przez bardzo wysokie stężenia występujące w początkowej fazie analizowanego szeregu czasowego. Najmniejsze stężenie występuje w czerwcu. W miesiącach: maj-grudzień, luty, kwiecień widać tendencję spadkową. Styczeń oraz luty prezentują dosyć podobny zakres wartości. Poprawa parametru może wynikać z dużej uwagi przywiązywanej temu zagadnieniu. Jest mocno nagłaśniany (m.in. artykuły informujące o przekroczeniach, darmowa komunikacja gdy normy zostaną przekroczone, elektorniczne tablice pokazujące jakość powietrza i wartość stężeń głównie PM10 - taka informacja jest także przekazywana np. w Krakowiskich autobusach)
gg_subseries(Radom_m, obs, period = "year")Seria prawodopodobnie nie jest białym szumem, ponieważ występują dane poza zakresem przerywanych linii. (chyba po prostu bo jest autokorelacja.)
ACF(Radom_m, lag_max = 12) %>% autoplot() + labs(title = "Bialy szum")Bardzo wysoka korelacja między dew_point i air_temp. air_temp oraz visibility. Usuwamy temperaturę z analizę regresji ponieważ wysoka korelacja między składnikami może zaburzyć poprawność modeli.
Radom_m %>%
GGally::ggpairs(columns = 2:8)Radom_m2 <- subset(Radom_m, select=-c(air_temp))Dane podzielono na zbiór treningowy (2004-2018) oraz testowy (2019)
train <- Radom_m2 %>% filter(year(date)>=2004 & year(date) <= 2017)
test <- Radom_m2 %>% filter(year(date)>=2018 & year(date) <= 2019)Pierwszy model bazujemy na najbardziej na najbardziej skorelowanej zmiennej z PM10 czyli dew_point. R^2 wynosi: 0.4933. Wynik możemy interpretować jako że model wyjaśnia 49,33% przypadków czyli całkiem dobrze.
train %>% ggplot(aes(x = obs, y = dew_point)) +
geom_point() +
geom_smooth(method = "lm") +
ggtitle("Regresja liniowa PM10~dew_point")fit <- train %>%
model(TSLM(formula = obs~dew_point))
fit %>% report()## Series: obs
## Model: TSLM
##
## Residuals:
## Min 1Q Median 3Q Max
## -20.665 -6.444 -1.541 3.889 84.166
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 46.9423 1.1427 41.08 <2e-16 ***
## dew_point -1.8179 0.1421 -12.79 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.02 on 166 degrees of freedom
## Multiple R-squared: 0.4963, Adjusted R-squared: 0.4933
## F-statistic: 163.6 on 1 and 166 DF, p-value: < 2.22e-16
Następnie budujemy model którego parametrami są wszystkie zmienne. R^2 wynosi: 0.6574. Największy wpływ na model miał ws, dew_point oraz data. Model wyjaśnia 65,86% danych, zauważalny jest spory postęp.
# przyjrzyjmy się
fit_multi <- train %>%
model(TSLM(formula = obs ~ ws + dew_point + atmos_pres + RH + visibility + date))
#unemployment nie jest wazny
fit_multi %>% report()## Series: obs
## Model: TSLM
##
## Residuals:
## Min 1Q Median 3Q Max
## -18.807 -4.967 -1.251 3.630 75.134
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.735e+02 2.225e+02 -1.679 0.0952 .
## ws -1.332e+01 2.062e+00 -6.460 1.18e-09 ***
## dew_point -1.982e+00 2.632e-01 -7.530 3.42e-12 ***
## atmos_pres 4.912e-01 2.146e-01 2.289 0.0234 *
## RH 1.671e-01 1.639e-01 1.020 0.3094
## visibility -3.233e-04 2.672e-04 -1.210 0.2280
## date -3.093e-03 6.813e-04 -4.539 1.10e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.882 on 161 degrees of freedom
## Multiple R-squared: 0.6697, Adjusted R-squared: 0.6574
## F-statistic: 54.42 on 6 and 161 DF, p-value: < 2.22e-16
Scalamy dane z regresji prostej liniowej oraz regresji wielorakiej.
#RESZTY R NORMALNY RESZT, BRAK WZORCOW, BRAK AUTOKORELACJI, SREDNIA = 0
#jak brak autokorelacji reszt to nie mozemy stosowac przedzialow ufnosci, no chyba ze zrobimy bootstrap
# scalamy dane
bind_rows(fit %>%
augment() %>%
mutate(nazwa = "liniowy"),
fit_multi %>%
augment() %>%
mutate(nazwa = "wieloraki")
) -> dopasowaniePo nałożeniu modeli na dane możemy zaobserwować jak model radzi sobie z odwzorowaniem danych. Faktycznie model regresji lepiej radzi sobie z odwzorowaniem, zwłaszcza wartości odstających.
dopasowanie %>%
ggplot(aes(x = date)) +
geom_line(aes(y = .fitted, color = "model")) +
geom_line(aes(y = obs, color = "dane")) +
facet_wrap(~nazwa, ncol = 1) +
labs(color = "Reprezentacja")Ponownie lepsza jest regresja wielokara (większe zagęszczenie punktów przy prostej).
dopasowanie %>%
ggplot(aes(x = obs, y = .fitted, color = nazwa)) +
geom_point() +
geom_abline(intercept = 0, slope = 1) +
facet_wrap(~nazwa, ncol = 1)Zarówno dla prostej jak i wielorakiej nie występuje autokorelacja reszt. Wariancja jest jednorodna.
fit %>% gg_tsresiduals()fit_multi %>% gg_tsresiduals()W teście Shapiro-Wilka p-value mniejsze niż 0.05
shapiro.test(fit %>% residuals() %>% pull(.resid))##
## Shapiro-Wilk normality test
##
## data: fit %>% residuals() %>% pull(.resid)
## W = 0.76288, p-value = 3.418e-15
ggpubr::ggqqplot(fit %>% residuals() %>% pull(.resid))shapiro.test(fit_multi %>% residuals() %>% pull(.resid))##
## Shapiro-Wilk normality test
##
## data: fit_multi %>% residuals() %>% pull(.resid)
## W = 0.7637, p-value = 3.638e-15
ggpubr::ggqqplot(fit_multi %>% residuals() %>% pull(.resid))Jako predyktora przy regresji używamy wyłącznie trendu i sezonowości. R^2: 0.4662 czyli całkiem nieźle.
fit_Radom <- Radom_m2 %>%
model(TSLM(obs ~ trend() + season()))
fit_Radom %>% report() ## Series: obs
## Model: TSLM
##
## Residuals:
## Min 1Q Median 3Q Max
## -25.4899 -5.3774 -0.9006 3.0073 75.6065
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 60.00693 3.34761 17.925 < 2e-16 ***
## trend() -0.05404 0.01578 -3.424 0.000764 ***
## season()year2 0.72812 4.27655 0.170 0.864999
## season()year3 -7.54310 4.27664 -1.764 0.079473 .
## season()year4 -16.64150 4.27678 -3.891 0.000141 ***
## season()year5 -29.76693 4.27699 -6.960 6.21e-11 ***
## season()year6 -32.51124 4.27725 -7.601 1.59e-12 ***
## season()year7 -31.75911 4.27757 -7.425 4.43e-12 ***
## season()year8 -28.86390 4.27795 -6.747 2.02e-10 ***
## season()year9 -19.57353 4.27838 -4.575 8.87e-06 ***
## season()year10 -16.04158 4.27888 -3.749 0.000239 ***
## season()year11 -12.73619 4.27943 -2.976 0.003323 **
## season()year12 -10.36126 4.28004 -2.421 0.016484 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.1 on 179 degrees of freedom
## Multiple R-squared: 0.5018, Adjusted R-squared: 0.4684
## F-statistic: 15.03 on 12 and 179 DF, p-value: < 2.22e-16
Dla K=2 wartość R^2 wynosi: 0.4775.
fit_k2 <- Radom_m2 %>%
model(TSLM(obs ~ trend() + fourier(K=2)))
fit_k2 %>% report()## Series: obs
## Model: TSLM
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.655 -5.945 -1.554 3.237 76.769
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 42.97267 1.73600 24.754 < 2e-16 ***
## trend() -0.05461 0.01561 -3.499 0.000584 ***
## fourier(K = 2)C1_12 15.18269 1.22143 12.430 < 2e-16 ***
## fourier(K = 2)S1_12 -0.20367 1.22272 -0.167 0.867886
## fourier(K = 2)C2_12 0.03772 1.22143 0.031 0.975400
## fourier(K = 2)S2_12 4.34692 1.22163 3.558 0.000474 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.97 on 186 degrees of freedom
## Multiple R-squared: 0.4934, Adjusted R-squared: 0.4797
## F-statistic: 36.23 on 5 and 186 DF, p-value: < 2.22e-16
Utworzyliśmy wszystkie kombinacje parametrów (wraz z transformacjami: log, log10, potęga) i sprawdziliśmy model dla nich. Najlepszymi kombinacjami sortując po kryterium AIC okazały się modele z transformacją log10, a najlepszym z nich log10(obs) ~ ws + atmos_pres + dew_point + trend() + season(). Bardzo dobre wyniki uzyskały również modele z transformacją log.
list(mod1 = obs ~ dew_point,
mod2 = obs ~ atmos_pres,
mod3 = obs ~ ws,
mod4 = obs ~ visibility,
mod5 = obs ~ RH,
mod6 = obs ~ dew_point + atmos_pres,
mod7 = obs ~ dew_point + ws,
mod8 = obs ~ dew_point + visibility,
mod9 = obs ~ atmos_pres + ws,
mod10 = obs ~ atmos_pres + visibility,
mod11 = obs ~ RH + visibility,
mod12 = obs ~ RH + atmos_pres,
mod13 = obs ~ RH + ws,
mod14 = obs ~ RH + dew_point,
mod15 = obs ~ ws + visibility,
mod16 = obs ~ ws + atmos_pres + visibility,
mod17 = obs ~ ws + atmos_pres + dew_point,
mod18 = obs ~ ws + atmos_pres + RH,
mod19 = obs ~ ws + visibility + dew_point,
mod20 = obs ~ ws + visibility + RH,
mod21 = obs ~ ws + dew_point + RH,
mod22 = obs ~ atmos_pres + visibility + dew_point,
mod23 = obs ~ atmos_pres + visibility + RH,
mod24 = obs ~ atmos_pres + dew_point + RH,
mod25 = obs ~ visibility + dew_point + RH,
mod26 = obs ~ ws + atmos_pres + visibility + dew_point,
mod27 = obs ~ ws + atmos_pres + dew_point + RH,
mod28 = obs ~ ws + atmos_pres + visibility + RH,
mod29 = obs ~ ws + dew_point + visibility + RH,
mod30 = obs ~ dew_point + atmos_pres + RH + visibility,
mod31 = obs ~ dew_point + atmos_pres + RH + visibility+ws,
mod32 = obs ~ dew_point + trend() + season(),
mod33 = obs ~ atmos_pres + trend() + season(),
mod34 = obs ~ ws + trend() + season(),
mod35 = obs ~ visibility + trend() + season(),
mod36 = obs ~ RH + trend() + season(),
mod37 = obs ~ dew_point + atmos_pres + trend() + season(),
mod38 = obs ~ dew_point + ws + trend() + season(),
mod39 = obs ~ dew_point + visibility + trend() + season(),
mod40 = obs ~ atmos_pres + ws + trend() + season(),
mod41 = obs ~ atmos_pres + visibility + trend() + season(),
mod42 = obs ~ RH + visibility + trend() + season(),
mod43 = obs ~ RH + atmos_pres + trend() + season(),
mod44 = obs ~ RH + ws + trend() + season(),
mod45 = obs ~ RH + dew_point + trend() + season(),
mod46 = obs ~ ws + visibility + trend() + season(),
mod47 = obs ~ ws + atmos_pres + visibility + trend() + season(),
mod48 = obs ~ ws + atmos_pres + dew_point + trend() + season(),
mod49 = obs ~ ws + atmos_pres + RH + trend() + season(),
mod50 = obs ~ ws + visibility + dew_point + trend() + season(),
mod51 = obs ~ ws + visibility + RH + trend() + season(),
mod52 = obs ~ ws + dew_point + RH + trend() + season(),
mod53 = obs ~ atmos_pres + visibility + dew_point + trend() + season(),
mod54 = obs ~ atmos_pres + visibility + RH + trend() + season(),
mod55 = obs ~ atmos_pres + dew_point + RH + trend() + season(),
mod56 = obs ~ visibility + dew_point + RH + trend() + season(),
mod57 = obs ~ ws + atmos_pres + visibility + dew_point + trend() + season(),
mod58 = obs ~ ws + atmos_pres + dew_point + RH + trend() + season(),
mod59 = obs ~ ws + atmos_pres + visibility + RH + trend() + season(),
mod60 = obs ~ ws + dew_point + visibility + RH + trend() + season(),
mod61 = obs ~ dew_point + atmos_pres + RH + visibility + trend() + season(),
mod62 = obs ~ dew_point + atmos_pres + RH + visibility+ws + trend() + season(),
mod63 = obs^2 ~ dew_point,
mod64 = obs^2 ~ atmos_pres,
mod65 = obs^2 ~ ws,
mod66 = obs^2 ~ visibility,
mod67 = obs^2 ~ RH,
mod68 = obs^2 ~ dew_point + atmos_pres,
mod69 = obs^2 ~ dew_point + ws,
mod70 = obs^2 ~ dew_point + visibility,
mod71 = obs^2 ~ atmos_pres + ws,
mod72 = obs^2 ~ atmos_pres + visibility,
mod73 = obs^2 ~ RH + visibility,
mod74 = obs^2 ~ RH + atmos_pres,
mod75 = obs^2 ~ RH + ws,
mod76 = obs^2 ~ RH + dew_point,
mod78 = obs^2 ~ ws + visibility,
mod79 = obs^2 ~ ws + atmos_pres + visibility,
mod80 = obs^2 ~ ws + atmos_pres + dew_point,
mod81 = obs^2 ~ ws + atmos_pres + RH,
mod82 = obs^2 ~ ws + visibility + dew_point,
mod83 = obs^2 ~ ws + visibility + RH,
mod84 = obs^2 ~ ws + dew_point + RH,
mod85 = obs^2 ~ atmos_pres + visibility + dew_point,
mod86 = obs^2 ~ atmos_pres + visibility + RH,
mod87 = obs^2 ~ atmos_pres + dew_point + RH,
mod88 = obs^2 ~ visibility + dew_point + RH,
mod89 = obs^2 ~ ws + atmos_pres + visibility + dew_point,
mod90 = obs^2 ~ ws + atmos_pres + dew_point + RH,
mod91 = obs^2 ~ ws + atmos_pres + visibility + RH,
mod92 = obs^2 ~ ws + dew_point + visibility + RH,
mod93 = obs^2 ~ dew_point + atmos_pres + RH + visibility,
mod94 = obs^2 ~ dew_point + atmos_pres + RH + visibility+ws,
mod95 = obs^2 ~ dew_point + trend() + season(),
mod96 = obs^2 ~ atmos_pres + trend() + season(),
mod97 = obs^2 ~ ws + trend() + season(),
mod98 = obs^2 ~ visibility + trend() + season(),
mod99 = obs^2 ~ RH + trend() + season(),
mod100 = obs^2 ~ dew_point + atmos_pres + trend() + season(),
mod38_K = obs^2 ~ dew_point + ws + trend() + season(),
mod39_K = obs^2 ~ dew_point + visibility + trend() + season(),
mod40_K = obs^2 ~ atmos_pres + ws + trend() + season(),
mod41_K = obs^2 ~ atmos_pres + visibility + trend() + season(),
mod42_K = obs^2 ~ RH + visibility + trend() + season(),
mod43_K = obs^2 ~ RH + atmos_pres + trend() + season(),
mod44_K = obs^2 ~ RH + ws + trend() + season(),
mod45_K = obs^2 ~ RH + dew_point + trend() + season(),
mod46_K = obs^2 ~ ws + visibility + trend() + season(),
mod47_K = obs^2 ~ ws + atmos_pres + visibility + trend() + season(),
mod48_K = obs^2 ~ ws + atmos_pres + dew_point + trend() + season(),
mod49_K = obs^2 ~ ws + atmos_pres + RH + trend() + season(),
mod50_K = obs^2 ~ ws + visibility + dew_point + trend() + season(),
mod51_K = obs^2 ~ ws + visibility + RH + trend() + season(),
mod52_K = obs^2 ~ ws + dew_point + RH + trend() + season(),
mod53_K = obs^2 ~ atmos_pres + visibility + dew_point + trend() + season(),
mod54_K = obs^2 ~ atmos_pres + visibility + RH + trend() + season(),
mod55_K = obs^2 ~ atmos_pres + dew_point + RH + trend() + season(),
mod56_K = obs^2 ~ visibility + dew_point + RH + trend() + season(),
mod57_K = obs^2 ~ ws + atmos_pres + visibility + dew_point + trend() + season(),
mod58_K = obs^2 ~ ws + atmos_pres + dew_point + RH + trend() + season(),
mod59_K = obs^2 ~ ws + atmos_pres + visibility + RH + trend() + season(),
mod60_K = obs^2 ~ ws + dew_point + visibility + RH + trend() + season(),
mod61_K = obs^2 ~ dew_point + atmos_pres + RH + visibility + trend() + season(),
mod62_K = obs^2 ~ dew_point + atmos_pres + RH + visibility+ws + trend() + season(),
mod1_log = log(obs) ~ dew_point,
mod2_log = log(obs) ~ atmos_pres,
mod3_log = log(obs) ~ ws,
mod4_log = log(obs) ~ visibility,
mod5_log = log(obs) ~ RH,
mod6_log = log(obs) ~ dew_point + atmos_pres,
mod7_log = log(obs) ~ dew_point + ws,
mod8_log = log(obs) ~ dew_point + visibility,
mod9_log = log(obs) ~ atmos_pres + ws,
mod10_log = log(obs) ~ atmos_pres + visibility,
mod11_log = log(obs) ~ RH + visibility,
mod12_log = log(obs) ~ RH + atmos_pres,
mod13_log = log(obs) ~ RH + ws,
mod14_log = log(obs) ~ RH + dew_point,
mod15_log = log(obs) ~ ws + visibility,
mod16_log = log(obs) ~ ws + atmos_pres + visibility,
mod17_log = log(obs) ~ ws + atmos_pres + dew_point,
mod18_log = log(obs) ~ ws + atmos_pres + RH,
mod19_log = log(obs) ~ ws + visibility + dew_point,
mod20_log = log(obs) ~ ws + visibility + RH,
mod21_log = log(obs) ~ ws + dew_point + RH,
mod22_log = log(obs) ~ atmos_pres + visibility + dew_point,
mod23_log = log(obs) ~ atmos_pres + visibility + RH,
mod24_log = log(obs) ~ atmos_pres + dew_point + RH,
mod25_log = log(obs) ~ visibility + dew_point + RH,
mod26_log = log(obs) ~ ws + atmos_pres + visibility + dew_point,
mod27_log = log(obs) ~ ws + atmos_pres + dew_point + RH,
mod28_log = log(obs) ~ ws + atmos_pres + visibility + RH,
mod29_log = log(obs) ~ ws + dew_point + visibility + RH,
mod30_log = log(obs) ~ dew_point + atmos_pres + RH + visibility,
mod31_log = log(obs) ~ dew_point + atmos_pres + RH + visibility+ws,
mod32_log = log(obs) ~ dew_point + trend() + season(),
mod33_log = log(obs) ~ atmos_pres + trend() + season(),
mod34_log = log(obs) ~ ws + trend() + season(),
mod35_log = log(obs) ~ visibility + trend() + season(),
mod36_log = log(obs) ~ RH + trend() + season(),
mod37_log = log(obs) ~ dew_point + atmos_pres + trend() + season(),
mod38_log = log(obs) ~ dew_point + ws + trend() + season(),
mod39_log = log(obs) ~ dew_point + visibility + trend() + season(),
mod40_log = log(obs) ~ atmos_pres + ws + trend() + season(),
mod41_log = log(obs) ~ atmos_pres + visibility + trend() + season(),
mod42_log = log(obs) ~ RH + visibility + trend() + season(),
mod43_log = log(obs) ~ RH + atmos_pres + trend() + season(),
mod44_log = log(obs) ~ RH + ws + trend() + season(),
mod45_log = log(obs) ~ RH + dew_point + trend() + season(),
mod46_log = log(obs) ~ ws + visibility + trend() + season(),
mod47_log = log(obs) ~ ws + atmos_pres + visibility + trend() + season(),
mod48_log = log(obs) ~ ws + atmos_pres + dew_point + trend() + season(),
mod49_log = log(obs) ~ ws + atmos_pres + RH + trend() + season(),
mod50_log = log(obs) ~ ws + visibility + dew_point + trend() + season(),
mod51_log = log(obs) ~ ws + visibility + RH + trend() + season(),
mod52_log = log(obs) ~ ws + dew_point + RH + trend() + season(),
mod53_log = log(obs) ~ atmos_pres + visibility + dew_point + trend() + season(),
mod54_log = log(obs) ~ atmos_pres + visibility + RH + trend() + season(),
mod55_log = log(obs) ~ atmos_pres + dew_point + RH + trend() + season(),
mod56_log = log(obs) ~ visibility + dew_point + RH + trend() + season(),
mod57_log = log(obs) ~ ws + atmos_pres + visibility + dew_point + trend() + season(),
mod58_log = log(obs) ~ ws + atmos_pres + dew_point + RH + trend() + season(),
mod59_log = log(obs) ~ ws + atmos_pres + visibility + RH + trend() + season(),
mod60_log = log(obs) ~ ws + dew_point + visibility + RH + trend() + season(),
mod61_log = log(obs) ~ dew_point + atmos_pres + RH + visibility + trend() + season(),
mod62_log = log(obs) ~ dew_point + atmos_pres + RH + visibility+ws + trend() + season(),
mod1_log10 = log10(obs) ~ dew_point,
mod2_log10 = log10(obs) ~ atmos_pres,
mod3_log10 = log10(obs) ~ ws,
mod4_log10 = log10(obs) ~ visibility,
mod5_log10 = log10(obs) ~ RH,
mod6_log10 = log10(obs) ~ dew_point + atmos_pres,
mod7_log10 = log10(obs) ~ dew_point + ws,
mod8_log10 = log10(obs) ~ dew_point + visibility,
mod9_log10 = log10(obs) ~ atmos_pres + ws,
mod10_log10 = log10(obs) ~ atmos_pres + visibility,
mod11_log10 = log10(obs) ~ RH + visibility,
mod12_log10 = log10(obs) ~ RH + atmos_pres,
mod13_log10 = log10(obs) ~ RH + ws,
mod14_log10 = log10(obs) ~ RH + dew_point,
mod15_log10 = log10(obs) ~ ws + visibility,
mod16_log10 = log10(obs) ~ ws + atmos_pres + visibility,
mod17_log10 = log10(obs) ~ ws + atmos_pres + dew_point,
mod18_log10 = log10(obs) ~ ws + atmos_pres + RH,
mod19_log10 = log10(obs) ~ ws + visibility + dew_point,
mod20_log10 = log10(obs) ~ ws + visibility + RH,
mod21_log10 = log10(obs) ~ ws + dew_point + RH,
mod22_log10 = log10(obs) ~ atmos_pres + visibility + dew_point,
mod23_log10 = log10(obs) ~ atmos_pres + visibility + RH,
mod24_log10 = log10(obs) ~ atmos_pres + dew_point + RH,
mod25_log10 = log10(obs) ~ visibility + dew_point + RH,
mod26_log10 = log10(obs) ~ ws + atmos_pres + visibility + dew_point,
mod27_log10 = log10(obs) ~ ws + atmos_pres + dew_point + RH,
mod28_log10 = log10(obs) ~ ws + atmos_pres + visibility + RH,
mod29_log10 = log10(obs) ~ ws + dew_point + visibility + RH,
mod30_log10 = log10(obs) ~ dew_point + atmos_pres + RH + visibility,
mod31_log10 = log10(obs) ~ dew_point + atmos_pres + RH + visibility+ws,
mod32_log10 = log10(obs) ~ dew_point + trend() + season(),
mod33_log10 = log10(obs) ~ atmos_pres + trend() + season(),
mod34_log10 = log10(obs) ~ ws + trend() + season(),
mod35_log10 = log10(obs) ~ visibility + trend() + season(),
mod36_log10 = log10(obs) ~ RH + trend() + season(),
mod37_log10 = log10(obs) ~ dew_point + atmos_pres + trend() + season(),
mod38_log10 = log10(obs) ~ dew_point + ws + trend() + season(),
mod39_log10 = log10(obs) ~ dew_point + visibility + trend() + season(),
mod40_log10 = log10(obs) ~ atmos_pres + ws + trend() + season(),
mod41_log10 = log10(obs) ~ atmos_pres + visibility + trend() + season(),
mod42_log10 = log10(obs) ~ RH + visibility + trend() + season(),
mod43_log10 = log10(obs) ~ RH + atmos_pres + trend() + season(),
mod44_log10 = log10(obs) ~ RH + ws + trend() + season(),
mod45_log10 = log10(obs) ~ RH + dew_point + trend() + season(),
mod46_log10 = log10(obs) ~ ws + visibility + trend() + season(),
mod47_log10 = log10(obs) ~ ws + atmos_pres + visibility + trend() + season(),
mod48_log10 = log10(obs) ~ ws + atmos_pres + dew_point + trend() + season(),
mod49_log10 = log10(obs) ~ ws + atmos_pres + RH + trend() + season(),
mod50_log10 = log10(obs) ~ ws + visibility + dew_point + trend() + season(),
mod51_log10 = log10(obs) ~ ws + visibility + RH + trend() + season(),
mod52_log10 = log10(obs) ~ ws + dew_point + RH + trend() + season(),
mod53_log10 = log10(obs) ~ atmos_pres + visibility + dew_point + trend() + season(),
mod54_log10 = log10(obs) ~ atmos_pres + visibility + RH + trend() + season(),
mod55_log10 = log10(obs) ~ atmos_pres + dew_point + RH + trend() + season(),
mod56_log10 = log10(obs) ~ visibility + dew_point + RH + trend() + season(),
mod57_log10 = log10(obs) ~ ws + atmos_pres + visibility + dew_point + trend() + season(),
mod58_log10 = log10(obs) ~ ws + atmos_pres + dew_point + RH + trend() + season(),
mod59_log10 = log10(obs) ~ ws + atmos_pres + visibility + RH + trend() + season(),
mod60_log10 = log10(obs) ~ ws + dew_point + visibility + RH + trend() + season(),
mod61_log10 = log10(obs) ~ dew_point + atmos_pres + RH + visibility + trend() + season(),
mod62_log10 = log10(obs) ~ dew_point + atmos_pres + RH + visibility+ws + trend() + season())-> formy
map(formy, as.formula) -> formy
out <- map(.x = formy,
.f = ~model(train,
TSLM(formula = .x)) %>%
glance()
) %>%
do.call(rbind, .) %>%
dplyr::select(.model, adj_r_squared, CV, AIC, AICc, BIC)
out %>%
mutate(.model = formy) %>%
arrange(AIC) %>%
knitr::kable(digits = 2)| .model | adj_r_squared | CV | AIC | AICc | BIC |
|---|---|---|---|---|---|
| log10(obs) ~ ws + atmos_pres + dew_point + trend() + season() | 0.77 | 0.01 | -822.58 | -818.50 | -769.47 |
| log10(obs) ~ ws + atmos_pres + visibility + dew_point + trend() + , season() | 0.77 | 0.01 | -822.53 | -817.94 | -766.30 |
| log10(obs) ~ dew_point + atmos_pres + RH + visibility + ws + , trend() + season() | 0.77 | 0.01 | -822.03 | -816.89 | -762.67 |
| log10(obs) ~ ws + dew_point + visibility + RH + trend() + season() | 0.77 | 0.01 | -821.53 | -816.94 | -765.30 |
| log10(obs) ~ ws + atmos_pres + dew_point + RH + trend() + season() | 0.76 | 0.01 | -820.92 | -816.33 | -764.69 |
| log10(obs) ~ dew_point + ws + trend() + season() | 0.76 | 0.01 | -820.35 | -816.75 | -770.37 |
| log10(obs) ~ ws + dew_point + RH + trend() + season() | 0.76 | 0.01 | -820.11 | -816.03 | -767.00 |
| log10(obs) ~ ws + visibility + dew_point + trend() + season() | 0.76 | 0.01 | -819.72 | -815.64 | -766.61 |
| log10(obs) ~ ws + atmos_pres + visibility + RH + trend() + season() | 0.76 | 0.01 | -817.56 | -812.97 | -761.33 |
| log10(obs) ~ ws + visibility + RH + trend() + season() | 0.76 | 0.01 | -816.69 | -812.61 | -763.59 |
| log10(obs) ~ ws + atmos_pres + visibility + trend() + season() | 0.76 | 0.01 | -815.27 | -811.19 | -762.16 |
| log10(obs) ~ atmos_pres + ws + trend() + season() | 0.75 | 0.01 | -814.76 | -811.16 | -764.77 |
| log10(obs) ~ ws + atmos_pres + RH + trend() + season() | 0.75 | 0.01 | -814.61 | -810.53 | -761.51 |
| log10(obs) ~ RH + ws + trend() + season() | 0.75 | 0.01 | -813.25 | -809.65 | -763.26 |
| log10(obs) ~ ws + trend() + season() | 0.74 | 0.01 | -809.86 | -806.70 | -763.00 |
| log10(obs) ~ ws + visibility + trend() + season() | 0.75 | 0.01 | -809.60 | -806.00 | -759.62 |
| log10(obs) ~ dew_point + atmos_pres + RH + visibility + ws | 0.73 | 0.01 | -807.81 | -807.11 | -785.94 |
| log10(obs) ~ ws + dew_point + visibility + RH | 0.73 | 0.01 | -806.31 | -805.79 | -787.56 |
| log10(obs) ~ ws + atmos_pres + visibility + dew_point | 0.72 | 0.01 | -803.57 | -803.05 | -784.83 |
| log10(obs) ~ ws + visibility + dew_point | 0.71 | 0.01 | -800.62 | -800.25 | -785.00 |
| log10(obs) ~ dew_point + atmos_pres + RH + visibility | 0.70 | 0.01 | -794.09 | -793.57 | -775.34 |
| log10(obs) ~ atmos_pres + visibility + dew_point | 0.70 | 0.01 | -790.38 | -790.01 | -774.76 |
| log10(obs) ~ atmos_pres + visibility + dew_point + trend() + , season() | 0.71 | 0.01 | -787.95 | -783.87 | -734.84 |
| log10(obs) ~ visibility + dew_point + RH | 0.69 | 0.01 | -787.64 | -787.27 | -772.02 |
| log10(obs) ~ dew_point + atmos_pres + RH + visibility + trend() + , season() | 0.71 | 0.01 | -786.27 | -781.68 | -730.04 |
| log10(obs) ~ dew_point + visibility | 0.68 | 0.01 | -781.82 | -781.57 | -769.32 |
| log10(obs) ~ visibility + dew_point + RH + trend() + season() | 0.70 | 0.01 | -780.48 | -776.40 | -727.38 |
| log10(obs) ~ dew_point + visibility + trend() + season() | 0.70 | 0.01 | -779.60 | -776.00 | -729.62 |
| log10(obs) ~ atmos_pres + visibility + RH + trend() + season() | 0.69 | 0.01 | -776.76 | -772.68 | -723.65 |
| log10(obs) ~ ws + atmos_pres + visibility + RH | 0.67 | 0.01 | -776.51 | -775.99 | -757.77 |
| log10(obs) ~ atmos_pres + visibility + RH | 0.67 | 0.01 | -776.43 | -776.06 | -760.81 |
| log10(obs) ~ atmos_pres + visibility + trend() + season() | 0.69 | 0.01 | -776.15 | -772.55 | -726.17 |
| log10(obs) ~ ws + visibility + RH | 0.66 | 0.01 | -772.78 | -772.41 | -757.16 |
| log10(obs) ~ ws + atmos_pres + dew_point + RH | 0.66 | 0.01 | -771.46 | -770.94 | -752.72 |
| log10(obs) ~ RH + visibility | 0.66 | 0.01 | -770.63 | -770.38 | -758.13 |
| log10(obs) ~ atmos_pres + dew_point + RH + trend() + season() | 0.68 | 0.01 | -770.24 | -766.16 | -717.13 |
| log10(obs) ~ dew_point + atmos_pres + trend() + season() | 0.68 | 0.01 | -769.81 | -766.21 | -719.83 |
| log10(obs) ~ RH + visibility + trend() + season() | 0.68 | 0.01 | -769.46 | -765.86 | -719.48 |
| log10(obs) ~ ws + atmos_pres + dew_point | 0.65 | 0.01 | -767.57 | -767.20 | -751.95 |
| log10(obs) ~ ws + dew_point + RH | 0.65 | 0.01 | -765.97 | -765.60 | -750.35 |
| log10(obs) ~ atmos_pres + visibility | 0.64 | 0.01 | -765.02 | -764.78 | -752.53 |
| log10(obs) ~ ws + atmos_pres + visibility | 0.64 | 0.01 | -763.82 | -763.45 | -748.20 |
| log10(obs) ~ dew_point + ws | 0.64 | 0.01 | -762.77 | -762.53 | -750.28 |
| log10(obs) ~ dew_point + trend() + season() | 0.66 | 0.01 | -762.49 | -759.33 | -715.63 |
| log10(obs) ~ visibility + trend() + season() | 0.66 | 0.01 | -761.81 | -758.65 | -714.95 |
| log10(obs) ~ atmos_pres + dew_point + RH | 0.64 | 0.01 | -760.74 | -760.37 | -745.12 |
| log10(obs) ~ RH + dew_point + trend() + season() | 0.66 | 0.01 | -760.67 | -757.07 | -710.69 |
| log10(obs) ~ dew_point + atmos_pres | 0.63 | 0.01 | -757.25 | -757.01 | -744.76 |
| log10(obs) ~ ws + visibility | 0.63 | 0.01 | -756.53 | -756.28 | -744.03 |
| log10(obs) ~ visibility | 0.62 | 0.01 | -756.09 | -755.94 | -746.72 |
| log10(obs) ~ atmos_pres + trend() + season() | 0.64 | 0.01 | -753.19 | -750.03 | -706.33 |
| log10(obs) ~ RH + atmos_pres + trend() + season() | 0.64 | 0.01 | -751.59 | -747.99 | -701.61 |
| log10(obs) ~ RH + dew_point | 0.61 | 0.01 | -749.11 | -748.87 | -736.62 |
| log10(obs) ~ dew_point | 0.60 | 0.01 | -746.63 | -746.49 | -737.26 |
| log10(obs) ~ RH + trend() + season() | 0.61 | 0.01 | -738.03 | -734.87 | -691.17 |
| log10(obs) ~ ws + atmos_pres + RH | 0.34 | 0.02 | -658.92 | -658.55 | -643.30 |
| log10(obs) ~ RH + atmos_pres | 0.29 | 0.02 | -649.71 | -649.46 | -637.21 |
| log10(obs) ~ RH + ws | 0.22 | 0.02 | -633.76 | -633.51 | -621.26 |
| log10(obs) ~ atmos_pres + ws | 0.22 | 0.02 | -633.10 | -632.85 | -620.60 |
| log10(obs) ~ RH | 0.20 | 0.02 | -628.74 | -628.59 | -619.36 |
| log10(obs) ~ atmos_pres | 0.12 | 0.03 | -613.61 | -613.46 | -604.24 |
| log10(obs) ~ ws | 0.08 | 0.03 | -606.82 | -606.67 | -597.45 |
| log(obs) ~ ws + atmos_pres + dew_point + trend() + season() | 0.77 | 0.04 | -542.35 | -538.27 | -489.24 |
| log(obs) ~ ws + atmos_pres + visibility + dew_point + trend() + , season() | 0.77 | 0.04 | -542.30 | -537.71 | -486.07 |
| log(obs) ~ dew_point + atmos_pres + RH + visibility + ws + trend() + , season() | 0.77 | 0.04 | -541.79 | -536.66 | -482.44 |
| log(obs) ~ ws + dew_point + visibility + RH + trend() + season() | 0.77 | 0.04 | -541.30 | -536.71 | -485.07 |
| log(obs) ~ ws + atmos_pres + dew_point + RH + trend() + season() | 0.76 | 0.04 | -540.69 | -536.10 | -484.46 |
| log(obs) ~ dew_point + ws + trend() + season() | 0.76 | 0.04 | -540.11 | -536.51 | -490.13 |
| log(obs) ~ ws + dew_point + RH + trend() + season() | 0.76 | 0.04 | -539.88 | -535.80 | -486.77 |
| log(obs) ~ ws + visibility + dew_point + trend() + season() | 0.76 | 0.04 | -539.48 | -535.40 | -486.37 |
| log(obs) ~ ws + atmos_pres + visibility + RH + trend() + season() | 0.76 | 0.04 | -537.32 | -532.73 | -481.09 |
| log(obs) ~ ws + visibility + RH + trend() + season() | 0.76 | 0.04 | -536.46 | -532.38 | -483.35 |
| log(obs) ~ ws + atmos_pres + visibility + trend() + season() | 0.76 | 0.04 | -535.04 | -530.96 | -481.93 |
| log(obs) ~ atmos_pres + ws + trend() + season() | 0.75 | 0.04 | -534.52 | -530.92 | -484.54 |
| log(obs) ~ ws + atmos_pres + RH + trend() + season() | 0.75 | 0.04 | -534.38 | -530.30 | -481.27 |
| log(obs) ~ RH + ws + trend() + season() | 0.75 | 0.04 | -533.01 | -529.41 | -483.03 |
| log(obs) ~ ws + trend() + season() | 0.74 | 0.04 | -529.62 | -526.46 | -482.76 |
| log(obs) ~ ws + visibility + trend() + season() | 0.75 | 0.04 | -529.37 | -525.76 | -479.38 |
| log(obs) ~ dew_point + atmos_pres + RH + visibility + ws | 0.73 | 0.04 | -527.57 | -526.87 | -505.70 |
| log(obs) ~ ws + dew_point + visibility + RH | 0.73 | 0.04 | -526.07 | -525.55 | -507.33 |
| log(obs) ~ ws + atmos_pres + visibility + dew_point | 0.72 | 0.04 | -523.34 | -522.81 | -504.59 |
| log(obs) ~ ws + visibility + dew_point | 0.71 | 0.04 | -520.39 | -520.02 | -504.77 |
| log(obs) ~ dew_point + atmos_pres + RH + visibility | 0.70 | 0.05 | -513.85 | -513.33 | -495.11 |
| log(obs) ~ atmos_pres + visibility + dew_point | 0.70 | 0.05 | -510.14 | -509.77 | -494.52 |
| log(obs) ~ atmos_pres + visibility + dew_point + trend() + season() | 0.71 | 0.05 | -507.72 | -503.64 | -454.61 |
| log(obs) ~ visibility + dew_point + RH | 0.69 | 0.05 | -507.41 | -507.04 | -491.79 |
| log(obs) ~ dew_point + atmos_pres + RH + visibility + trend() + , season() | 0.71 | 0.05 | -506.04 | -501.45 | -449.81 |
| log(obs) ~ dew_point + visibility | 0.68 | 0.05 | -501.58 | -501.34 | -489.09 |
| log(obs) ~ visibility + dew_point + RH + trend() + season() | 0.70 | 0.05 | -500.25 | -496.17 | -447.14 |
| log(obs) ~ dew_point + visibility + trend() + season() | 0.70 | 0.05 | -499.37 | -495.77 | -449.39 |
| log(obs) ~ atmos_pres + visibility + RH + trend() + season() | 0.69 | 0.05 | -496.53 | -492.45 | -443.42 |
| log(obs) ~ ws + atmos_pres + visibility + RH | 0.67 | 0.05 | -496.28 | -495.75 | -477.53 |
| log(obs) ~ atmos_pres + visibility + RH | 0.67 | 0.05 | -496.20 | -495.83 | -480.58 |
| log(obs) ~ atmos_pres + visibility + trend() + season() | 0.69 | 0.05 | -495.92 | -492.32 | -445.94 |
| log(obs) ~ ws + visibility + RH | 0.66 | 0.05 | -492.54 | -492.17 | -476.92 |
| log(obs) ~ ws + atmos_pres + dew_point + RH | 0.66 | 0.05 | -491.23 | -490.71 | -472.49 |
| log(obs) ~ RH + visibility | 0.66 | 0.05 | -490.39 | -490.15 | -477.90 |
| log(obs) ~ atmos_pres + dew_point + RH + trend() + season() | 0.68 | 0.05 | -490.00 | -485.92 | -436.90 |
| log(obs) ~ dew_point + atmos_pres + trend() + season() | 0.68 | 0.05 | -489.57 | -485.97 | -439.59 |
| log(obs) ~ RH + visibility + trend() + season() | 0.68 | 0.05 | -489.23 | -485.62 | -439.24 |
| log(obs) ~ ws + atmos_pres + dew_point | 0.65 | 0.05 | -487.33 | -486.96 | -471.71 |
| log(obs) ~ ws + dew_point + RH | 0.65 | 0.05 | -485.73 | -485.36 | -470.11 |
| log(obs) ~ atmos_pres + visibility | 0.64 | 0.06 | -484.79 | -484.54 | -472.29 |
| log(obs) ~ ws + atmos_pres + visibility | 0.64 | 0.06 | -483.58 | -483.21 | -467.96 |
| log(obs) ~ dew_point + ws | 0.64 | 0.06 | -482.54 | -482.29 | -470.04 |
| log(obs) ~ dew_point + trend() + season() | 0.66 | 0.06 | -482.26 | -479.10 | -435.40 |
| log(obs) ~ visibility + trend() + season() | 0.66 | 0.06 | -481.58 | -478.42 | -434.72 |
| log(obs) ~ atmos_pres + dew_point + RH | 0.64 | 0.06 | -480.50 | -480.13 | -464.88 |
| log(obs) ~ RH + dew_point + trend() + season() | 0.66 | 0.06 | -480.44 | -476.83 | -430.45 |
| log(obs) ~ dew_point + atmos_pres | 0.63 | 0.06 | -477.02 | -476.77 | -464.52 |
| log(obs) ~ ws + visibility | 0.63 | 0.06 | -476.29 | -476.05 | -463.80 |
| log(obs) ~ visibility | 0.62 | 0.06 | -475.86 | -475.71 | -466.48 |
| log(obs) ~ atmos_pres + trend() + season() | 0.64 | 0.06 | -472.96 | -469.80 | -426.10 |
| log(obs) ~ RH + atmos_pres + trend() + season() | 0.64 | 0.06 | -471.36 | -467.76 | -421.38 |
| log(obs) ~ RH + dew_point | 0.61 | 0.06 | -468.88 | -468.63 | -456.38 |
| log(obs) ~ dew_point | 0.60 | 0.06 | -466.40 | -466.25 | -457.03 |
| log(obs) ~ RH + trend() + season() | 0.61 | 0.07 | -457.79 | -454.63 | -410.93 |
| log(obs) ~ ws + atmos_pres + RH | 0.34 | 0.10 | -378.68 | -378.31 | -363.06 |
| log(obs) ~ RH + atmos_pres | 0.29 | 0.11 | -369.47 | -369.23 | -356.98 |
| log(obs) ~ RH + ws | 0.22 | 0.12 | -353.52 | -353.28 | -341.03 |
| log(obs) ~ atmos_pres + ws | 0.22 | 0.12 | -352.86 | -352.62 | -340.37 |
| log(obs) ~ RH | 0.20 | 0.12 | -348.50 | -348.35 | -339.13 |
| log(obs) ~ atmos_pres | 0.12 | 0.14 | -333.37 | -333.23 | -324.00 |
| log(obs) ~ ws | 0.08 | 0.14 | -326.58 | -326.44 | -317.21 |
| obs ~ ws + atmos_pres + dew_point + RH + trend() + season() | 0.67 | 110.05 | 783.69 | 788.28 | 839.92 |
| obs ~ ws + atmos_pres + dew_point + trend() + season() | 0.66 | 109.12 | 784.73 | 788.81 | 837.84 |
| obs ~ ws + atmos_pres + visibility + dew_point + trend() + season() | 0.66 | 108.80 | 784.88 | 789.47 | 841.11 |
| obs ~ dew_point + atmos_pres + RH + visibility + ws + trend() + , season() | 0.66 | 110.60 | 785.12 | 790.26 | 844.48 |
| obs ~ dew_point + ws + trend() + season() | 0.65 | 110.17 | 787.54 | 791.14 | 837.52 |
| obs ~ ws + visibility + dew_point + trend() + season() | 0.65 | 110.15 | 788.29 | 792.37 | 841.39 |
| obs ~ ws + dew_point + RH + trend() + season() | 0.65 | 111.76 | 789.00 | 793.08 | 842.11 |
| obs ~ ws + dew_point + visibility + RH + trend() + season() | 0.65 | 112.03 | 790.18 | 794.77 | 846.41 |
| obs ~ ws + atmos_pres + visibility + dew_point | 0.62 | 115.28 | 795.58 | 796.10 | 814.32 |
| obs ~ dew_point + atmos_pres + RH + visibility + ws | 0.62 | 116.14 | 796.76 | 797.46 | 818.63 |
| obs ~ ws + atmos_pres + visibility + trend() + season() | 0.64 | 116.65 | 796.87 | 800.95 | 849.97 |
| obs ~ atmos_pres + ws + trend() + season() | 0.63 | 117.49 | 797.37 | 800.97 | 847.35 |
| obs ~ ws + visibility + dew_point | 0.61 | 116.15 | 798.52 | 798.89 | 814.14 |
| obs ~ ws + atmos_pres + visibility + RH + trend() + season() | 0.63 | 119.33 | 798.86 | 803.45 | 855.09 |
| obs ~ ws + atmos_pres + RH + trend() + season() | 0.63 | 119.89 | 798.98 | 803.06 | 852.09 |
| obs ~ ws + dew_point + visibility + RH | 0.61 | 116.37 | 799.05 | 799.57 | 817.80 |
| obs ~ ws + trend() + season() | 0.62 | 119.85 | 803.51 | 806.67 | 850.37 |
| obs ~ ws + visibility + trend() + season() | 0.62 | 119.45 | 803.84 | 807.44 | 853.82 |
| obs ~ ws + visibility + RH + trend() + season() | 0.62 | 120.24 | 804.44 | 808.52 | 857.55 |
| obs ~ RH + ws + trend() + season() | 0.62 | 121.36 | 805.10 | 808.70 | 855.08 |
| obs ~ ws + atmos_pres + dew_point + RH | 0.58 | 124.55 | 808.98 | 809.50 | 827.72 |
| obs ~ dew_point + atmos_pres + RH + visibility + trend() + season() | 0.60 | 130.08 | 812.20 | 816.79 | 868.43 |
| obs ~ ws + atmos_pres + dew_point | 0.57 | 126.34 | 812.21 | 812.58 | 827.83 |
| obs ~ atmos_pres + visibility + dew_point + trend() + season() | 0.60 | 129.11 | 813.05 | 817.13 | 866.16 |
| obs ~ atmos_pres + visibility + dew_point | 0.57 | 128.85 | 813.43 | 813.80 | 829.05 |
| obs ~ ws + dew_point + RH | 0.57 | 127.43 | 814.12 | 814.49 | 829.74 |
| obs ~ dew_point + atmos_pres + RH + visibility | 0.57 | 129.86 | 814.70 | 815.22 | 833.45 |
| obs ~ dew_point + ws | 0.56 | 128.82 | 816.73 | 816.98 | 829.23 |
| obs ~ atmos_pres + dew_point + RH + trend() + season() | 0.58 | 135.52 | 819.52 | 823.60 | 872.62 |
| obs ~ dew_point + visibility + trend() + season() | 0.58 | 134.75 | 821.77 | 825.37 | 871.75 |
| obs ~ dew_point + visibility | 0.54 | 133.94 | 822.71 | 822.96 | 835.21 |
| obs ~ visibility + dew_point + RH | 0.55 | 133.93 | 822.94 | 823.31 | 838.56 |
| obs ~ visibility + dew_point + RH + trend() + season() | 0.57 | 137.03 | 823.67 | 827.75 | 876.78 |
| obs ~ atmos_pres + dew_point + RH | 0.54 | 138.24 | 825.46 | 825.83 | 841.08 |
| obs ~ ws + atmos_pres + visibility + RH | 0.54 | 137.92 | 825.84 | 826.36 | 844.59 |
| obs ~ dew_point + atmos_pres | 0.53 | 139.92 | 828.17 | 828.41 | 840.67 |
| obs ~ atmos_pres + visibility + RH | 0.53 | 140.45 | 828.20 | 828.57 | 843.82 |
| obs ~ dew_point + atmos_pres + trend() + season() | 0.56 | 142.20 | 828.98 | 832.59 | 878.97 |
| obs ~ ws + atmos_pres + visibility | 0.53 | 140.42 | 829.29 | 829.66 | 844.91 |
| obs ~ atmos_pres + visibility + trend() + season() | 0.56 | 142.66 | 829.72 | 833.32 | 879.71 |
| obs ~ atmos_pres + visibility | 0.52 | 141.97 | 830.41 | 830.66 | 842.91 |
| obs ~ ws + visibility + RH | 0.53 | 139.06 | 830.43 | 830.80 | 846.05 |
| obs ~ atmos_pres + visibility + RH + trend() + season() | 0.55 | 145.65 | 831.65 | 835.73 | 884.76 |
| obs ~ RH + dew_point + trend() + season() | 0.54 | 146.35 | 834.60 | 838.21 | 884.59 |
| obs ~ RH + visibility | 0.51 | 143.83 | 835.83 | 836.07 | 848.32 |
| obs ~ ws + visibility | 0.51 | 143.62 | 836.37 | 836.61 | 848.86 |
| obs ~ dew_point + trend() + season() | 0.53 | 147.61 | 836.76 | 839.92 | 883.62 |
| obs ~ RH + dew_point | 0.50 | 146.56 | 837.65 | 837.89 | 850.14 |
| obs ~ dew_point | 0.49 | 147.61 | 839.43 | 839.58 | 848.80 |
| obs ~ visibility | 0.49 | 147.32 | 840.22 | 840.36 | 849.59 |
| obs ~ RH + visibility + trend() + season() | 0.51 | 152.53 | 844.95 | 848.55 | 894.93 |
| obs ~ visibility + trend() + season() | 0.51 | 152.36 | 845.24 | 848.40 | 892.10 |
| obs ~ RH + atmos_pres + trend() + season() | 0.50 | 160.45 | 848.06 | 851.66 | 898.04 |
| obs ~ atmos_pres + trend() + season() | 0.49 | 161.63 | 850.70 | 853.86 | 897.56 |
| obs ~ RH + trend() + season() | 0.44 | 173.89 | 867.75 | 870.91 | 914.61 |
| obs ~ ws + atmos_pres + RH | 0.29 | 211.82 | 898.92 | 899.29 | 914.54 |
| obs ~ RH + atmos_pres | 0.27 | 214.75 | 900.65 | 900.89 | 913.15 |
| obs ~ atmos_pres + ws | 0.17 | 243.21 | 922.32 | 922.56 | 934.81 |
| obs ~ RH | 0.17 | 239.55 | 922.38 | 922.52 | 931.75 |
| obs ~ RH + ws | 0.17 | 239.02 | 922.58 | 922.83 | 935.08 |
| obs ~ atmos_pres | 0.13 | 257.14 | 931.07 | 931.22 | 940.44 |
| obs ~ ws | 0.04 | 276.44 | 947.31 | 947.46 | 956.68 |
| obs^2 ~ ws + atmos_pres + dew_point + RH + trend() + season() | 0.49 | 2392560.97 | 2452.42 | 2457.01 | 2508.65 |
| obs^2 ~ dew_point + atmos_pres + RH + visibility + ws + trend() + , season() | 0.48 | 2408392.05 | 2454.40 | 2459.53 | 2513.75 |
| obs^2 ~ ws + atmos_pres + dew_point + trend() + season() | 0.46 | 2433782.28 | 2459.12 | 2463.20 | 2512.22 |
| obs^2 ~ ws + atmos_pres + visibility + dew_point + trend() + , season() | 0.46 | 2435375.87 | 2460.16 | 2464.75 | 2516.39 |
| obs^2 ~ ws + dew_point + RH + trend() + season() | 0.46 | 2429097.90 | 2461.62 | 2465.70 | 2514.73 |
| obs^2 ~ dew_point + ws + trend() + season() | 0.45 | 2429890.08 | 2462.65 | 2466.26 | 2512.64 |
| obs^2 ~ ws + dew_point + visibility + RH + trend() + season() | 0.45 | 2441721.09 | 2463.62 | 2468.21 | 2519.85 |
| obs^2 ~ ws + visibility + dew_point + trend() + season() | 0.45 | 2435669.59 | 2464.12 | 2468.20 | 2517.23 |
| obs^2 ~ ws + atmos_pres + visibility + dew_point | 0.41 | 2474090.34 | 2465.32 | 2465.84 | 2484.06 |
| obs^2 ~ dew_point + atmos_pres + RH + visibility + ws | 0.40 | 2493988.79 | 2467.25 | 2467.95 | 2489.12 |
| obs^2 ~ ws + atmos_pres + dew_point + RH | 0.40 | 2501729.88 | 2467.41 | 2467.93 | 2486.15 |
| obs^2 ~ dew_point + atmos_pres + RH + visibility + trend() + , season() | 0.44 | 2603047.74 | 2467.50 | 2472.09 | 2523.73 |
| obs^2 ~ atmos_pres + dew_point + RH + trend() + season() | 0.43 | 2612501.40 | 2467.88 | 2471.96 | 2520.99 |
| obs^2 ~ ws + atmos_pres + dew_point | 0.39 | 2512154.08 | 2468.42 | 2468.79 | 2484.04 |
| obs^2 ~ ws + visibility + dew_point | 0.39 | 2457053.03 | 2468.74 | 2469.12 | 2484.36 |
| obs^2 ~ ws + atmos_pres + RH + trend() + season() | 0.43 | 2594236.62 | 2469.23 | 2473.31 | 2522.34 |
| obs^2 ~ atmos_pres + ws + trend() + season() | 0.42 | 2578338.23 | 2470.36 | 2473.97 | 2520.35 |
| obs^2 ~ ws + dew_point + visibility + RH | 0.39 | 2472245.17 | 2470.74 | 2471.26 | 2489.48 |
| obs^2 ~ ws + atmos_pres + visibility + RH + trend() + season() | 0.43 | 2602170.85 | 2470.83 | 2475.42 | 2527.06 |
| obs^2 ~ ws + atmos_pres + visibility + trend() + season() | 0.42 | 2570907.10 | 2470.91 | 2474.99 | 2524.02 |
| obs^2 ~ ws + dew_point + RH | 0.38 | 2511200.00 | 2472.29 | 2472.66 | 2487.91 |
| obs^2 ~ dew_point + ws | 0.37 | 2516630.35 | 2472.88 | 2473.12 | 2485.37 |
| obs^2 ~ atmos_pres + visibility + dew_point + trend() + season() | 0.41 | 2656249.06 | 2474.56 | 2478.64 | 2527.67 |
| obs^2 ~ ws + trend() + season() | 0.40 | 2586244.23 | 2477.27 | 2480.42 | 2524.13 |
| obs^2 ~ atmos_pres + visibility + dew_point | 0.36 | 2669435.51 | 2477.65 | 2478.02 | 2493.27 |
| obs^2 ~ ws + visibility + trend() + season() | 0.39 | 2584816.57 | 2478.43 | 2482.03 | 2528.41 |
| obs^2 ~ RH + ws + trend() + season() | 0.39 | 2617319.62 | 2479.17 | 2482.77 | 2529.15 |
| obs^2 ~ dew_point + atmos_pres + RH + visibility | 0.36 | 2691273.12 | 2479.59 | 2480.11 | 2498.33 |
| obs^2 ~ atmos_pres + dew_point + RH | 0.35 | 2698005.86 | 2479.60 | 2479.97 | 2495.22 |
| obs^2 ~ dew_point + atmos_pres | 0.34 | 2707205.06 | 2480.38 | 2480.63 | 2492.88 |
| obs^2 ~ ws + visibility + RH + trend() + season() | 0.39 | 2615643.56 | 2480.43 | 2484.51 | 2533.54 |
| obs^2 ~ visibility + dew_point + RH + trend() + season() | 0.38 | 2720661.81 | 2482.57 | 2486.65 | 2535.68 |
| obs^2 ~ dew_point + visibility + trend() + season() | 0.38 | 2710909.47 | 2482.84 | 2486.44 | 2532.82 |
| obs^2 ~ dew_point + atmos_pres + trend() + season() | 0.38 | 2794948.80 | 2482.91 | 2486.51 | 2532.89 |
| obs^2 ~ ws + atmos_pres + visibility | 0.34 | 2722084.94 | 2483.28 | 2483.65 | 2498.90 |
| obs^2 ~ ws + atmos_pres + visibility + RH | 0.34 | 2734097.69 | 2484.40 | 2484.92 | 2503.14 |
| obs^2 ~ atmos_pres + visibility | 0.33 | 2763319.67 | 2484.86 | 2485.11 | 2497.36 |
| obs^2 ~ RH + dew_point + trend() + season() | 0.37 | 2779800.59 | 2485.43 | 2489.03 | 2535.41 |
| obs^2 ~ atmos_pres + visibility + RH | 0.32 | 2783587.02 | 2486.42 | 2486.79 | 2502.04 |
| obs^2 ~ dew_point + visibility | 0.32 | 2709209.25 | 2486.74 | 2486.98 | 2499.23 |
| obs^2 ~ visibility + dew_point + RH | 0.32 | 2724697.97 | 2488.66 | 2489.03 | 2504.28 |
| obs^2 ~ atmos_pres + visibility + RH + trend() + season() | 0.36 | 2895426.61 | 2489.03 | 2493.11 | 2542.13 |
| obs^2 ~ atmos_pres + visibility + trend() + season() | 0.35 | 2866756.51 | 2489.30 | 2492.91 | 2539.29 |
| obs^2 ~ ws + visibility + RH | 0.31 | 2717515.96 | 2490.01 | 2490.38 | 2505.63 |
| obs^2 ~ ws + visibility | 0.31 | 2723288.93 | 2490.16 | 2490.41 | 2502.66 |
| obs^2 ~ dew_point + trend() + season() | 0.35 | 2847091.68 | 2490.67 | 2493.83 | 2537.53 |
| obs^2 ~ RH + dew_point | 0.30 | 2777536.37 | 2490.70 | 2490.94 | 2503.19 |
| obs^2 ~ dew_point | 0.30 | 2779114.53 | 2490.86 | 2491.01 | 2500.23 |
| obs^2 ~ visibility | 0.28 | 2800157.59 | 2494.60 | 2494.74 | 2503.97 |
| obs^2 ~ RH + visibility | 0.28 | 2804617.44 | 2495.16 | 2495.40 | 2507.65 |
| obs^2 ~ RH + atmos_pres + trend() + season() | 0.33 | 3016097.26 | 2495.49 | 2499.09 | 2545.47 |
| obs^2 ~ atmos_pres + trend() + season() | 0.30 | 3084874.67 | 2501.85 | 2505.01 | 2548.71 |
| obs^2 ~ visibility + trend() + season() | 0.29 | 2961805.49 | 2504.01 | 2507.17 | 2550.87 |
| obs^2 ~ RH + visibility + trend() + season() | 0.29 | 2994589.85 | 2505.86 | 2509.46 | 2555.84 |
| obs^2 ~ RH + atmos_pres | 0.21 | 3232600.63 | 2511.81 | 2512.05 | 2524.30 |
| obs^2 ~ ws + atmos_pres + RH | 0.21 | 3245335.16 | 2513.49 | 2513.86 | 2529.11 |
| obs^2 ~ RH + trend() + season() | 0.23 | 3201193.91 | 2517.67 | 2520.83 | 2564.53 |
| obs^2 ~ atmos_pres + ws | 0.13 | 3539680.30 | 2528.26 | 2528.51 | 2540.76 |
| obs^2 ~ atmos_pres | 0.12 | 3585321.46 | 2529.32 | 2529.47 | 2538.69 |
| obs^2 ~ RH | 0.11 | 3453464.19 | 2531.19 | 2531.33 | 2540.56 |
| obs^2 ~ RH + ws | 0.10 | 3475305.85 | 2533.18 | 2533.42 | 2545.67 |
| obs^2 ~ ws | 0.00 | 3826031.53 | 2549.70 | 2549.85 | 2559.08 |
Wybieram z nich modele do dalszego porównania:
log10(obs) ~ ws + atmos_pres + dew_point + trend() + season()))
log10(obs) ~ dew_point + ws + trend() + season()))
log10(obs) ~ ws + visibility + RH + trend() + season()))
log(obs) ~ ws + atmos_pres + dew_point + trend() + season()))
fit_m1 <- train %>%
model(TSLM(formula = log10(obs) ~ ws + atmos_pres + dew_point + trend() + season()))
fit_m2 <- train %>%
model(TSLM(formula = log10(obs) ~ dew_point + ws + trend() + season()))
fit_m3 <- train %>%
model(TSLM(formula = log10(obs) ~ ws + visibility + RH + trend() + season()))
fit_m4 <- train %>%
model(TSLM(formula = log(obs) ~ ws + atmos_pres + dew_point + trend() + season())) Model ETS to model tworzony przy użyciu metody wygładzenia wykładniczego, gdzie prognozy są ważonymi średnimi z poprzednich obserwacji, a wagi maleją wykładniczo z wiekiem obserwacji. Model ten ma minimalizowac AIC. AIC: 1663.651
fit_ets <- train %>%
model(ETS(obs))
fit_ets %>% report()## Series: obs
## Model: ETS(M,N,M)
## Smoothing parameters:
## alpha = 0.05456618
## gamma = 0.0001034682
##
## Initial states:
## l[0] s[0] s[-1] s[-2] s[-3] s[-4] s[-5] s[-6]
## 41.09406 1.2268 1.072638 0.9961869 1.121667 0.6531318 0.5883658 0.5550555
## s[-7] s[-8] s[-9] s[-10] s[-11]
## 0.626941 0.95563 1.193036 1.53211 1.478437
##
## sigma^2: 0.0776
##
## AIC AICc BIC
## 1663.651 1666.809 1710.510
Model Arima ma na celu opisanie autokorelacji danych. AIC w tym przypadku wyniosło 1391.58.
fit_arima <- train %>%
model(ARIMA(obs))
fit_arima %>% report()## Series: obs
## Model: ARIMA(0,0,2)(0,0,2)[12] w/ mean
##
## Coefficients:
## ma1 ma2 sma1 sma2 constant
## 0.3287 0.1761 0.1491 0.1331 38.2222
## s.e. 0.0805 0.0734 0.0823 0.0749 2.1289
##
## sigma^2 estimated as 221.3: log likelihood=-689.79
## AIC=1391.58 AICc=1392.1 BIC=1410.32
AIC w tym przypadku wyniosło 1298.81.
fit_dyn<- train %>%
model(ARIMA(obs ~ dew_point + I(dew_point^2)))
fit_dyn %>% report()## Series: obs
## Model: LM w/ ARIMA(1,0,0) errors
##
## Coefficients:
## ar1 dew_point I(dew_point^2) intercept
## 0.2257 -2.5128 0.0832 44.7907
## s.e. 0.0759 0.2320 0.0217 1.4745
##
## sigma^2 estimated as 128.7: log likelihood=-644.4
## AIC=1298.81 AICc=1299.18 BIC=1314.43
Model przetestowano dla K ={1,..,6}. Najlepsze okazały się K=5, K=6.
fit_harm <- train %>%
model(
`K = 1` = ARIMA(log(obs) ~ fourier(K = 1) + PDQ(0,0,0)),
`K = 2` = ARIMA(log(obs) ~ fourier(K = 2) + PDQ(0,0,0)),
`K = 3` = ARIMA(log(obs) ~ fourier(K = 3) + PDQ(0,0,0)),
`K = 4` = ARIMA(log(obs) ~ fourier(K = 4) + PDQ(0,0,0)),
`K = 5` = ARIMA(log(obs) ~ fourier(K = 5) + PDQ(0,0,0)),
`K = 6` = ARIMA(log(obs) ~ fourier(K = 6) + PDQ(0,0,0))
)
fit_harm %>%
forecast(h = "1 year") %>%
autoplot(train, size = 1, level = 80) +
facet_wrap(vars(.model), ncol = 2) +
guides(colour = FALSE) +
geom_label(
aes(x = yearmonth("2019 Jan"), y = 100, label = paste0("AICc = ", format(AICc))),
data = glance(fit_ar_fiur)
)fit_harm %>% accuracy()## # A tibble: 6 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 K = 1 Training 0.623 12.8 7.74 -4.98 18.9 0.702 0.676 -0.0277
## 2 K = 2 Training 0.766 12.4 7.23 -4.06 17.5 0.656 0.653 0.0540
## 3 K = 3 Training 0.766 12.4 7.22 -4.06 17.4 0.655 0.653 0.0540
## 4 K = 4 Training 0.774 12.4 7.18 -4.02 17.3 0.651 0.652 0.0552
## 5 K = 5 Training 0.772 12.3 7.19 -4.02 17.4 0.652 0.650 0.0598
## 6 K = 6 Training 0.770 12.3 7.19 -4.02 17.4 0.652 0.650 0.0591
Najlepsze dopasowanie wykresu funkcji do danych niż w przypadku innych modeli jest przy regresji wielorakiej oraz modelach m1,m2,m3,m4.
bind_rows(fit %>%
augment() %>%
mutate(nazwa = "Liniowa"),
fit_multi %>%
augment() %>%
mutate(nazwa = "Wieloraka"),
fit_Radom %>%
augment() %>%
mutate(nazwa = "Season + trend"),
fit_k2 %>%
augment() %>%
mutate(nazwa = "Fourier k2"),
fit_arima %>%
augment() %>%
mutate(nazwa = "Arima"),
fit_ets %>%
augment() %>%
mutate(nazwa = "ETS"),
fit_dyn %>%
augment() %>%
mutate(nazwa = "regresja dynamiczna"),
fit_m1 %>%
augment() %>%
mutate(nazwa = "log10(obs) ~ ws + atmos_pres + dew_point + trend() + season()"),
fit_m2 %>%
augment() %>%
mutate(nazwa = "log10(obs) ~ dew_point + ws + trend() + season()"),
fit_m3 %>%
augment() %>%
mutate(nazwa = "log10(obs) ~ ws + visibility + RH + trend() + season()"),
fit_m4 %>%
augment() %>%
mutate(nazwa = "log(obs) ~ ws + atmos_pres + dew_point + trend() + season()"),
) -> dopasowanie
dopasowanie %>%
ggplot(aes(x = date)) +
geom_line(aes(y = .fitted, color = "model")) +
geom_line(aes(y = obs, color = "dane")) +
facet_wrap(~nazwa, ncol = 1) +
labs(color = "Reprezentacja")Modele regresji wielorakiej oraz modelach m1,m2,m3,m4 są najlepiej wpasowanie do linii.
dopasowanie %>%
ggplot(aes(x = obs, y = .fitted, color = nazwa)) +
geom_point() +
geom_abline(intercept = 0, slope = 1) +
facet_wrap(~nazwa, ncol = 1)Najwyższą korelacje do danych PM10 wykazuje model 1 oraz 4 czyli: M1 - log10(obs) ~ ws + atmos_pres + dew_point + trend() + season(), M4 - log(obs) ~ ws + atmos_pres + dew_point + trend() + season()
dopasowanie %>%
as.data.frame() %>%
group_by(nazwa) %>%
summarise(korelacja = cor(obs, .fitted),
res_mean = mean(.resid),
res_med = median(.resid))%>%
arrange(korelacja) %>%
knitr::kable(digits = 2)| korelacja | res_mean | res_med |
|---|---|---|
| 0.75 | 0.16 | -0.91 |
##Test normalności modeli Wszystkie modele posiadają rozkłąd normalny oprócz modeli: m1, m2, m3, m4
# f1=fit%>% residuals() %>% pull(.resid) %>%shapiro.test();f1
# f2=fit_multi %>% residuals() %>% pull(.resid) %>%shapiro.test();f2
# f3=fit_Radom %>% residuals() %>% pull(.resid) %>%shapiro.test();f3
# f4=fit_k2 %>% residuals() %>% pull(.resid) %>%shapiro.test();f4
# f5=fit_arima %>% residuals() %>% pull(.resid) %>%shapiro.test();f5
# f6=fit_ets %>% residuals() %>% pull(.resid) %>%shapiro.test();f6
# f7=fit_dyn %>% residuals() %>% pull(.resid) %>%shapiro.test();f7
# f8=fit_m1 %>% residuals() %>% pull(.resid) %>%shapiro.test();f8
# f9=fit_m2 %>% residuals() %>% pull(.resid) %>%shapiro.test();f9
# f10=fit_m3 %>% residuals() %>% pull(.resid) %>%shapiro.test();f10
# f11=fit_m4 %>% residuals() %>% pull(.resid) %>%shapiro.test();f11
fit_multi %>% residuals() %>% pull(.resid) %>%shapiro.test()##
## Shapiro-Wilk normality test
##
## data: .
## W = 0.7637, p-value = 3.638e-15
normalnosc <- data.frame(model = c("Regresja liniowa prosta", "Regresja wieloraka", "Season + trend", "Fourier k2", "Arima", "ETS", "Regresja dynamiczna", "m1", "m2", "m3", "m4"), pvalue=c(9.637e-09, 2.638e-05, 9.316e-12, 6.815e-12, 2.256e-11, 1.737e-05, 2.348e-07, 0.7278, .74, 0.1148, 0.7278))
datatable(normalnosc)t1 <- fit%>%
augment() %>%
features(.resid, ljung_box, lag = 20, dof = 2)
t2 <- fit_multi %>%
augment() %>%
features(.resid, ljung_box, lag = 20, dof = 2)
t3 <- fit_Radom %>%
augment() %>%
features(.resid, ljung_box, lag = 20, dof = 2)
t4 <- fit_k2 %>%
augment() %>%
features(.resid, ljung_box, lag = 20, dof = 2)
t5 <- fit_ets %>%
augment() %>%
features(.resid, ljung_box, lag = 20, dof = 2)
t6 <- fit_arima %>%
augment() %>%
features(.resid, ljung_box, lag = 20, dof = 2)
t7 <- fit_dyn %>%
augment() %>%
features(.resid, ljung_box, lag = 20, dof = 2)
t8 <- fit_m1 %>%
augment() %>%
features(.resid, ljung_box, lag = 20, dof = 2)
t9 <- fit_m2 %>%
augment() %>%
features(.resid, ljung_box, lag = 20, dof = 2)
t10 <- fit_m3 %>%
augment() %>%
features(.resid, ljung_box, lag = 20, dof = 2)
t11 <- fit_m4 %>%
augment() %>%
features(.resid, ljung_box, lag = 20, dof = 2)
t <- data.frame(rbind(t1,t2,t3,t4, t5, t6, t7, t8, t9, t10, t11))
datatable(t)#ej mordo tutaj chyba odrzucamy hipoteze zerowa, czyli reszty sa ze soba skorelowane
# Odrzucamy hipotezę alternatywną o autokorelacji reszt. Podrzymujemy hipotezę
# zerową, że reszty nie są z sobą skorelowane (pvalu > 0.05)Wyjwyższe R^2 posiadał model wieloraki.
porownanie <- map(.x = list(fit, fit_multi, fit_Radom, fit_k2),
.f = ~glance(.x) %>%
dplyr::select(adj_r_squared, CV, AIC, AICc, BIC)
) %>%
do.call(rbind, .)
porownanie <- cbind(c("Liniowa","Wieloraka", "trend + season", "Fourier K=2"), porownanie)
names(porownanie)[1] <- "Model"
datatable(porownanie)test_1 <- fit %>% forecast(new_data = test)
test_2 <- fit_multi %>% forecast(new_data = test)
test_3 <- fit_k2 %>% forecast(new_data = test)
test_4 <- fit_arima %>% forecast(new_data = test)
test_5 <- fit_ets %>% forecast(new_data = test)
test_6 <- fit_dyn %>% forecast(new_data = test)
stats <- bind_rows(
fit %>% accuracy(),
fit_multi %>% accuracy(),
fit_k2 %>% accuracy(),
fit_arima %>% accuracy(),
fit_ets %>% accuracy(),
fit_dyn %>% accuracy(),
test_1 %>% accuracy(test),
test_2 %>% accuracy(test),
test_3 %>% accuracy(test),
test_4 %>% accuracy(test),
test_5 %>% accuracy(test),
test_6 %>% accuracy(test)
)
stats <- stats %>% mutate_at(vars(RMSE, MAE, MPE, MAPE, MASE, RMSSE, ACF1), funs(round(., 3)))
datatable(stats)Wybraliśmy najlepsze modele do dalszej analizy modele: regresji wielorakiej, regresji dynamicznej, regresji liniowej
Posiadały one najdokładniejsze dopasowania do rzeczywistego stężenia PM10 oraz najlepsze wyniki w testach. Dodatkowo spełniały warunek normalności.
Analizujemy również reszty tych modeli. W wybranych modelach nie wystepuje autokorelacja reszt. Wariacja bliska zeru.
fit%>%
gg_tsresiduals()fit_multi%>%
gg_tsresiduals()fit_dyn%>%
gg_tsresiduals()Najlepszą okazała się prognoza modelu regresji wielorakiej.
prog1<- fit_multi %>%
forecast(test, h = "6 month") %>%
autoplot(Radom_m) +
labs(title = "Wieloraka", y = "PM10")
prog2<-fit_dyn %>%
forecast(test, h = "6 month") %>%
autoplot(Radom_m) +
labs(title = "Dynamiczna", y = "PM10")
prog3<-fit %>%
forecast(test, h = "6 month") %>%
autoplot(Radom_m) +
labs(title = "Liniowa prosta", y = "PM10")
grid.arrange(prog1, prog2, prog3, ncol = 1)Metoda średniej
# Poniżej przedstawimy krótką prezentację 4 podstawowych i bardzo prostych
# metody prognozowania, które bywaja często skuteczne i wystarczające.
# **Metoda średniej** - prognozy przyszłych wartości są równe wartości średniej
# z danych historycznych
m1 <- train %>% model(MEAN(obs))Metoda Naiwne (prosta)
# **Metoda Naiwne (prosta)** - 'Naiwne' ponieważ zakłądają, że czynniki określające
# zmienną prognozowaną są stałe (niezmienne). Prognozą jest wartość ostatniej
# obserwacji. Tak zwane prognozy losowego marszu.
m2 <- train %>% model(NAIVE(obs))Metoda Naiwne (sezoowa)
# **Metoda naiwna (sozenowa)** - prognoza jest równan ostatniej obserwacji z
# każdego sezonu (pory roku, miesiaca, tygodnia itd..)
m3 <- train %>% model(SNAIVE(obs ~ lag("year")))Metoda dryfu (naiwna)
# **Metaoda dryfu (naiwna)** - prognozy mogą rosnąć lub maleć z czasem, przy
# czym zmianan ta jest średnią zmianą danych historycznych.
m4 <- train %>% model(RW(obs ~ drift()))Najlepszą z prognoz okazała się SNAIVE.
# wyniki prognoz
gridExtra::grid.arrange(m1 %>% forecast() %>% autoplot(train) + ggtitle("MEAN"),
m2 %>% forecast() %>% autoplot(train) + ggtitle("NAIVE"),
m3 %>% forecast() %>% autoplot(train) + ggtitle("SNAIVE"),
m4 %>% forecast() %>% autoplot(train) + ggtitle("RW"))# wyniki prognoz
gridExtra::grid.arrange(m1 %>% forecast() %>% autoplot(Radom_m) + ggtitle("MEAN"),
m2 %>% forecast() %>% autoplot(Radom_m) + ggtitle("NAIVE"),
m3 %>% forecast() %>% autoplot(Radom_m) + ggtitle("SNAIVE"),
m4 %>% forecast() %>% autoplot(Radom_m) + ggtitle("RW"))Najlepszym modelem okazał się model regresji wielorakiej, przez całą analizę posiadał bardzo dobre parametry oraz posiadał najlepszą prognozę. Zadowalające wyniki osiągnęły również modele regresji prostej liniowej oraz regresji dynamicznej. Z prognozowania prostymi metodami okazał się najlepszy model SNAIVE.